Toward Data Efficient Online Sequential Learning
Can machines take optimal sequential decisions over time? For decades, researchers have sought an answer to this question, with the ultimate goal of unlocking the potential of artificial general intelligence (AGI) for a better and more sustainable society. Many sectors would benefit from machines able to take efficient sequential decisions over time. Consider real-world applications such as personalized systems in entertainment (content systems), healthcare (personalized therapy), smart cities (traffic control, flood prevention), and robotics (control and planning). However, letting machines take proper decisions in real life is a highly challenging task because of the uncertainty behind such decisions (uncertainty about the actual reward, the context, the environment, etc.). A viable solution is to learn by experience (i.e., by trial and error), letting machines uncover the uncertainty while taking decisions and refine their strategy accordingly. However, such refinement is usually highly data-hungry (data inefficiency), requiring a large amount of application-specific data and leading to very slow learning processes, and hence very slow convergence to optimal strategies (curse of dimensionality). Luckily, data is usually intrinsically structured. Identifying and exploiting such structure substantially improves the data efficiency of sequential learning algorithms. This is the key hypothesis underpinning the research in this thesis, in which novel structural learning methodologies are proposed for decision-making problems such as Recommendation Systems (RS), Multi-armed Bandits (MAB) and Reinforcement Learning (RL), with the ultimate goal of making the learning process more data-efficient. Specifically, we tackle this goal by modelling the problem structure as graphs, embedding tools from graph signal processing into decision learning theory.
As a first step, we study the application of graph-clustering techniques to RS, in which the curse of dimensionality is addressed by grouping data into clusters. Next, we exploit spectral graph structure for MAB problems, which represent online learning problems. A key challenge is to sequentially learn the unknown bandit vector. Exploiting the smoothness prior (i.e., a bandit vector that is smooth on a given underlying graph), we study the Laplacian-regularized estimator and provide both empirical evidence and theoretical analysis of the benefits of exploiting the graph structure in MABs. Then, we focus on the theoretical understanding of the Laplacian-regularized estimator. To this end, we derive a theoretical upper bound on the estimation error, which illustrates the impact of the alignment between the data and the graph structure, as well as of the graph spectrum, on the estimation accuracy.
We then move to RL problems, focusing on the specific problem of learning a proper representation of the state-action space (representation learning problem). Motivated by the fact that a good representation should be informative of the value function, we seek a learning algorithm able to preserve continuity between the value function and the representation space. Showing that state values are intrinsically correlated with the structure of the state transition dynamics and the diffusion of the reward on the MDP graph, we build a new loss function based on the newly defined diffusion distance and propose a novel method to learn state representations with this desirable property.
In summary, in this thesis we address, both theoretically and empirically, important online sequential learning problems by leveraging the intrinsic data structure, showing the gain of the proposed solutions toward more data-efficient sequential learning strategies.
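The Laplacian-regularized estimator mentioned in the abstract above can be illustrated with a minimal sketch. This is not the thesis's implementation: it assumes a combinatorial Laplacian L = D - W, a scalar unknown per graph node, and hypothetical names (`graph_laplacian`, `laplacian_regularized_estimate`, `alpha`, `ridge`) chosen only for illustration. The small `ridge` term is an added assumption that keeps the linear system invertible when few nodes are observed.

```python
import numpy as np

def graph_laplacian(adjacency):
    """Combinatorial Laplacian L = D - W of an undirected weighted graph."""
    degrees = adjacency.sum(axis=1)
    return np.diag(degrees) - adjacency

def laplacian_regularized_estimate(X, y, laplacian, alpha=1.0, ridge=1e-6):
    """Closed-form minimizer of
    ||y - X theta||^2 + alpha * theta^T L theta + ridge * ||theta||^2.
    """
    n = laplacian.shape[0]
    A = X.T @ X + alpha * laplacian + ridge * np.eye(n)
    return np.linalg.solve(A, X.T @ y)

# Toy problem (assumed setup): a path graph on 5 nodes, one unknown per node.
n = 5
W = np.zeros((n, n))
for i in range(n - 1):
    W[i, i + 1] = W[i + 1, i] = 1.0
L = graph_laplacian(W)

# Ground-truth vector that is smooth along the path.
theta_true = np.linspace(0.0, 1.0, n)

# Only the two end nodes are observed, without noise.
observed = [0, 4]
X = np.eye(n)[observed]
y = theta_true[observed]

# The smoothness prior fills in the three unobserved nodes.
theta_hat = laplacian_regularized_estimate(X, y, L, alpha=0.1)
print(np.round(theta_hat, 3))
```

In this toy case the regularizer interpolates the unobserved nodes from their neighbours, which is the intuition behind the data-efficiency claim: observations at a few nodes inform estimates at all nodes when the unknown vector is smooth on the graph.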
Laplacian-regularized graph bandits: Algorithms and theoretical analysis
We consider a stochastic linear bandit problem with multiple users, where the
relationship between users is captured by an underlying graph and user
preferences are represented as smooth signals on the graph. We introduce a
novel bandit algorithm where the smoothness prior is imposed via the
random-walk graph Laplacian, which leads to a single-user cumulative regret
whose scaling depends on the time horizon, the feature dimensionality, and a
scalar parameter that depends on the graph connectivity. This is an
improvement over the regret of LinUCB (Li et al., 2010), where user
relationship is not taken into account. In terms of network regret (sum of
cumulative regret over users), the proposed algorithm also achieves a
significant improvement in scaling over the state-of-the-art algorithm
Gob.Lin (Cesa-Bianchi et al., 2013). To improve scalability, we
further propose a simplified algorithm with a linear computational complexity
with respect to the number of users, while maintaining the same regret.
Finally, we present a finite-time analysis of the proposed algorithms, and
demonstrate their advantage in comparison with state-of-the-art graph-based
bandit algorithms on both synthetic and real-world data.
Data Driven Chiller Plant Energy Optimization with Domain Knowledge
Refrigeration and chiller optimization is an important and well-studied topic
in mechanical engineering, mostly relying on physical models of the equipment
that are built on over-simplified assumptions. Conventional
optimization techniques using physical models make decisions of online
parameter tuning, based on very limited information of hardware specifications
and external conditions, e.g., outdoor weather. In recent years, a new generation
of sensors has become an essential part of new chiller plants, for the first time
allowing system administrators to continuously monitor the running status
of all equipment in a timely and accurate way. The explosive growth of data
flowing into databases, driven by the increasing analytical power of machine
learning and data mining, opens up new possibilities for data-driven approaches
to real-time chiller plant optimization. This paper presents our research and
industrial experience on the adoption of data models and optimizations in
chiller plants and discusses the lessons learnt from our practice on real-world
plants. Instead of employing complex machine learning models, we emphasize the
incorporation of appropriate domain knowledge into data analysis tools, which
turns out to be the key performance improver over state-of-the-art deep
learning techniques by a significant margin. Our empirical evaluation on a
real-world chiller plant achieves savings of more than 7% in daily power
consumption.
Comment: CIKM2017. Proceedings of the 26th ACM International Conference on
Information and Knowledge Management. 201
Disaster mechanism during passing of working face under overlying remnant coal pillar and advanced regional prevention technology
When the underlying working face in shallow and close seams is mined beneath an overlying remnant coal pillar, an intensive mine pressure-induced dynamic disaster is likely to occur, causing personnel injury and equipment damage and seriously threatening the safety of mine production. Characteristic induction, numerical simulation, mechanical model analysis and other research methods are used to clarify how the hazards of overlying remnant coal pillars in shallow and close seams arise, and to reveal the disaster mechanism caused by intensive mining pressure. The research shows that the disaster mechanism of the intensive ground pressure caused by the overlying remnant coal pillar is as follows: when the working face passes under the coal pillar, the coal pillar and the overlying bearing body are disturbed and suddenly lose stability, and the energy is transferred to the stope instantly and released in the form of kinetic energy, resulting in the intensive ground pressure-induced dynamic disaster. Based on the prevention and control idea of “collapsed rock support + weakening of key rock stratum + transfer of stress transmission path”, an advanced regional weakening technology using subsectional hydraulic fracturing is proposed, which modifies the migration space of the coal pillar and bearing body, weakens the key rock stratum, distributes concentrated stress uniformly and transfers the stress transmission path. Engineering tests were carried out at typical working faces.
The engineering test results show that during the implementation of hydraulic fracturing, the peak pumping pressure reached 23.4 MPa and the pressure generally changed in a “zigzag” pattern, with more than 60 sudden pressure drops; the artificial main fractures and micro fractures in the rock mass continued to develop alternately, effectively destroying the integrity of the rock mass. After treatment, the peak value and average value of periodic pressure decreased by 15.41% and 8.29%, respectively, and the peak value and average dynamic load coefficient decreased by 17.39% and 11.88%, respectively. The maximum contraction of the shield cylinder was 50.00% and less than 0.4 m, and the maximum contraction of the gate road roof was 33.33%. The working face safely passed through the affected area of the overlying remnant coal pillar, and the advanced area weakening technology of subsectional hydraulic fracturing can effectively prevent and control the intensive ground pressure disaster of the overlying remnant coal pillar in shallow and close seams.
Low-dose computed tomography for lung cancer screening in Anhui, China: A randomized controlled trial
Background: Lung cancer is the leading cause of cancer-related death worldwide, with risk factors such as age and smoking. Low-dose computed tomography screening can reduce lung cancer mortality. However, its effectiveness in Asian populations remains unclear. Most Asian women with lung cancer are non-smokers who have not been screened. We conducted a randomized controlled trial to evaluate the performance of low-dose computed tomography screening in a Chinese population, including high-risk smokers and non-smokers exposed to passive smoking. The baseline data are reported in this study.
Methods: Between May and December 2019, eligible participants were randomized in a ratio of 1:1:1 to a screening (two arms) or control cohort. Non-calcified nodules/masses with a diameter >4 mm on low-dose computed tomography were considered positive findings.
Results: In total, 600 patients (mean age, 59.1 ± 6.9 years) underwent low-dose computed tomography. Women accounted for 31.5% (189/600) of patients; 89.9% (170/189) were non-smokers/passive smokers. At baseline, the incidence of lung cancer was 1.8% (11/600). The incidence of lung cancer was significantly lower in smokers than in female non-smokers/passive smokers (1.0% [4/415] vs. 4.1% [7/170], respectively; P=0.017). Stage 0–I lung cancer accounted for 90.9% (10/11) of cases.
Conclusions: We demonstrate the importance of including active smokers and female non-smokers/passive smokers in lung cancer screening programs. Further studies are needed to explore the risk factors and long-term cost–benefit of screening Asian non-smoking women.
Clinical trial registration: http://chictr.org.cn/showproj.aspx?proj=39003, identifier ChiCTR1900023197
The TP53-Related Signature Predicts Immune Cell Infiltration, Therapeutic Response, and Prognosis in Patients With Esophageal Carcinoma
TP53 mutation (TP53MUT) is one of the most common gene mutations and frequently occurs in many cancers, especially esophageal carcinoma (ESCA), and it correlates with clinical prognostic outcomes. Nevertheless, the mechanisms by which TP53MUT regulates the correlation between ESCA and prognosis have not been sufficiently studied. In this study, we constructed a TP53MUT-related signature to predict the prognosis of patients with esophageal cancer and successfully verified this model in patients in the TP53 mutant group, esophageal squamous cell carcinoma group, and adenocarcinoma group. The risk scores proved to be better independent prognostic factors than clinical features, and prognostic features were combined with other clinical features to establish a convincing nomogram to predict overall survival from 1 to 3 years. In addition, we predicted the tumor immune cell infiltration and the responses to chemical drugs and immunotherapy in the high-risk and low-risk groups. Finally, the gene expression of the seven-gene signature (AP002478.1, BHLHA15, FFAR2, IGFBP1, KCTD8, PHYHD1, and SLC26A9) can provide personalized prognosis prediction and insights into new treatments.